Understanding the characteristics of high users of hospital services in Singapore and their associations with healthcare utilisation and mortality: A cluster analysis

Introduction High users of hospital services incur high costs and require targeted healthcare services planning for effective resource allocation. This study aims to segment the population in the “Ageing In Place-Community Care Team” (AIP-CCT), a programme for complex patients with high inpatient service use, and to examine the association of segment membership with healthcare utilisation and mortality.
Methods We analysed 1,012 patients enrolled between June 2016 and February 2017. To identify patient segments, a cluster analysis was performed based on medical complexity and psychosocial needs. Next, multivariable negative binomial regression was performed using patient segments as the predictor, with healthcare and programme utilisation over the 180-day follow-up as outcomes. Multivariable Cox proportional hazard regression was applied to compare the time to first hospital admission and mortality between segments within the 180-day follow-up. All models were adjusted for age, gender, ethnicity, ward class, and baseline healthcare utilisation.
Results Three distinct segments were identified: Segment 1 (n = 236), Segment 2 (n = 331), and Segment 3 (n = 445). Medical, functional, and psychosocial needs of individuals were significantly different between segments (p-value < 0.001). The rates of hospitalisation in Segments 1 (IRR = 1.63, 95%CI:1.3–2.1) and 2 (IRR = 2.11, 95%CI:1.7–2.6) were significantly higher than in Segment 3 on follow-up. Similarly, both Segments 1 (IRR = 1.76, 95%CI:1.6–2.0) and 2 (IRR = 1.25, 95%CI:1.1–1.4) had higher rates of programme utilisation compared to Segment 3. Patients in Segments 1 (HR = 2.48, 95%CI:1.5–4.1) and 2 (HR = 2.25, 95%CI:1.3–3.6) also had higher mortality on follow-up.
Conclusions This study provided a data-based approach to understanding healthcare needs among complex patients with high inpatient services utilisation. Resources and interventions can be tailored according to the differences in needs among segments, to facilitate better allocation.


Introduction
Globally, there is an increasing trend of population ageing with multiple chronic conditions, mental health and medication-related problems, and social vulnerability. This has contributed to a shift towards more complex needs of the ageing population [1]. Healthcare delivery systems and organisations that were initially developed in response to acute needs have been facing challenges in integrating care to address the complex needs of patients [2]. Older adults are often high users of inpatient services with repeated hospitalisations, and increasing challenges in care transition have resulted in hospital readmission, mortality, and higher healthcare costs [3]. Previous studies reported high healthcare costs associated with multiple comorbidities, mental health problems, increasing age, and end-of-life care [4][5][6].
The strain on healthcare resources has increased calls for more efficient and improved strategies in healthcare services: the Organization for Economic Cooperation and Development reported that high users of hospital resources account for an estimated 10% to 34% of healthcare expenditures [7]. More targeted, efficient, and coordinated healthcare services, built on a more person-centred understanding of these "high user" patients, should be sought to better address their complex needs and to facilitate better resource allocation [8].
While it is impossible to develop care models for every individual at a population level, understanding the needs of groups of people with similar characteristics could aid in resource allocation and programme planning. Identifying these groups, known as population segments, can allow optimisation of healthcare service planning at the population level and the development of an integrated healthcare system that is more targeted and efficient [9][10][11].
Population segmentation uses variables or characteristics to assign a population to homogeneous groups at various levels, including macro (e.g., general population), meso (e.g., a specific population with a certain disease or condition, such as diabetic patients), and micro (e.g., individual adverse event risk stratification) [10,12]. Segments can be derived through expert input (a priori) or post hoc using statistical methods applied to empirical data [13,14]. Most population segmentation studies use healthcare utilisation, medical, and socio-demographic characteristics as the basis of segmentation, and fewer studies have used functional and social variables [2,15,16]. There is increasing evidence that medical complexity alone is insufficient to explain healthcare utilisation [3]. A more comprehensive view that includes other domains of psychosocial needs is also considered important in understanding the factors driving healthcare utilisation and in guiding risk stratification and segmentation to improve care delivery for these patients [3].
This study segments a high user patient population in the Ageing In Place-Community Care Team (AIP-CCT) programme, a programme for complex patients with high inpatient service use, into segments with distinct characteristics, based on their medical complexity and psychosocial needs. We defined high users as complex patients with high inpatient service usage, specifically, those with three or more hospital admissions in the 12 months before enrolment in the AIP-CCT programme. We also explore the association between segment membership and healthcare utilisation (hospital admissions, length of hospital stay (LOS), emergency department (ED) visits, AIP-CCT programme utilisation) and mortality at 180-day follow-up.

Sample and data source
We used the National Healthcare Group (NHG) Regional Health System (RHS) database. The database included data on socio-demographics, primary and secondary diagnoses (based on the International Classification of Diseases, 10th Revision), public healthcare utilisation (hospital admissions, day surgeries, ED visits), hospital length of stay, and mortality. We also used the AIP-CCT programme administrative database, which included data from the programme's home visit assessment. These included the socio-demographic profile, medical, functional and psychosocial assessments, and programme utilisation. We included all patients enrolled into AIP-CCT between June 2016 and February 2017 (n = 1,326). After excluding 312 patients with incomplete data, our study analysed a total of 1,012 AIP-CCT patients. For each patient, the first AIP-CCT home visit was identified as the index time for analysis. The baseline period was defined as the 180 days prior to the index time. The follow-up period was the 180 days immediately following the first home visit assessment.
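The baseline and follow-up windows described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the handling of the boundary days (whether the index date itself belongs to either window) is an assumption, since the text does not specify it.

```python
from datetime import date, timedelta

WINDOW_DAYS = 180  # window length stated in the text


def study_windows(index_date: date) -> dict:
    """Return the baseline and follow-up window bounds around the index date.

    Assumption: baseline covers [index - 180 days, index) and follow-up
    covers (index, index + 180 days]; boundary handling is not given in
    the source.
    """
    span = timedelta(days=WINDOW_DAYS)
    return {
        "baseline_start": index_date - span,
        "baseline_end": index_date,
        "followup_start": index_date,
        "followup_end": index_date + span,
    }


def classify_event(event_date: date, index_date: date) -> str:
    """Label a utilisation event as baseline, follow-up, or outside."""
    w = study_windows(index_date)
    if w["baseline_start"] <= event_date < index_date:
        return "baseline"
    if index_date < event_date <= w["followup_end"]:
        return "follow-up"
    # Events on the index date itself fall here (assumption).
    return "outside"


# Hypothetical patient whose first home visit was on 15 June 2016.
index = date(2016, 6, 15)
print(classify_event(date(2016, 5, 1), index))
print(classify_event(date(2016, 9, 1), index))
```

Counting events per patient within each label would then give the baseline covariates and the follow-up outcomes used in the models.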

Programme description
The AIP-CCT programme was a homecare service that delivered multi-disciplinary care to a high user patient population in the northern region of Singapore. This population had complex needs, with progressive or life-limiting conditions and high inpatient services use (Table 1). High service use refers to three or more hospital admissions in the 12 months before programme enrolment. The delivery of care was guided by a comprehensive needs assessment of the medical, functional, nursing, and psychosocial profiles of the patient.
The AIP-CCT was a hospital-led programme which offered home visits and tele-consultations to support patients and caregivers in managing chronic diseases for 3 months after enrolment. The programme aimed to address the unmet needs of patients by giving support to the patients/caregivers in managing chronic disease to reduce patients' acute hospital utilisation. It also provided case management to link patients to community services for their social needs. Potential patients referred from hospital inpatient wards, specialist outpatient clinics (SOC), and the ED were triaged and linked to a community team based on their assessed needs. The team comprised nurses, doctors, physiotherapists, occupational therapists, speech therapists, pharmacists, medical social workers, and healthcare assistants [17]. If there were continuing needs beyond the 3-month period, patients would be referred to community long-term Home Medical and Home Nursing Services run by Social Service Agencies, which are non-profit organisations that provide welfare services and/or services that benefit the community at large [18].

Table 1. AIP-CCT programme inclusion and exclusion criteria.

Cluster analysis approach
We performed a cluster analysis to generate non-overlapping population segments. Clustering aimed to partition a set of data points into segments (clusters), so that data points within a segment were more similar to each other than to data points in other segments. Medical complexity and psychosocial needs were used as input variables to assign observations into homogeneous segments in the clustering process.
Medical complexity was measured using the Charlson Comorbidity Index (CCI). The CCI is widely used as a measure of comorbidity, with a higher score indicating higher mortality risk and more severe comorbid conditions [19]. It provides a valid assessment of an individual's unique clinical condition [19]. The chronic conditions used to compute the CCI included myocardial infarction, congestive heart failure, peripheral vascular disease, cerebrovascular disease, dementia, chronic pulmonary disease, rheumatic disease, mild/moderate/severe liver disease, diabetes with and without chronic complication, hemiplegia or paraplegia, renal disease, any malignancy, metastatic solid tumor, and HIV/AIDS [20].
Psychosocial need was measured using the Social Triage (ST) score, a composite variable covering patient/family social support, the patient's mental health, treatment compliance, and patient/family coping response. Patient/family social support was assessed by examining the availability of an existing caregiver or formal/informal social support for the patient. The patient's mental health was assessed by examining the existence of a mental condition (e.g., dementia, depression, psychiatric/substance abuse, alcoholism, or other behaviour problem) that affected care. Treatment compliance was based on the patient's adherence to treatment/care plans. Patient/family coping response was assessed by rating their ability to cope with diseases. Each component of the composite variable has a 3-point rating score, with a higher score indicating poorer support, more mental health issues, poorer treatment compliance, and poorer coping response, respectively. S1 Table describes each component within the ST score.
We conducted the cluster analysis using the Partitioning Around Medoids (PAM) approach. PAM is a simple unsupervised machine learning clustering algorithm that groups data into a specified number (k) of clusters (segments). This approach searches for representative observations in the dataset, called medoids, that are centrally located in clusters, and then assigns all other observations to the closest medoid in order to create clusters [21,22].
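To make the medoid idea concrete, the sketch below implements a naive PAM-style procedure (a greedy build step followed by swaps) in plain Python. It is an illustration only and not the authors' R code: the toy (CCI, ST) pairs are invented, and the Manhattan distance is an assumption, since the text does not state which distance measure was used.

```python
from itertools import product


def manhattan(a, b):
    """Manhattan distance between two points (assumed distance measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def total_cost(points, medoids, dist):
    """Sum of each point's distance to its closest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k, dist=manhattan):
    """Naive k-medoids: greedy BUILD, then SWAP until no swap lowers the cost."""
    n = len(points)
    medoids = []
    # BUILD: repeatedly add the observation that lowers the total cost most.
    while len(medoids) < k:
        best = min(
            (i for i in range(n) if i not in medoids),
            key=lambda i: total_cost(points, medoids + [i], dist),
        )
        medoids.append(best)
    # SWAP: replace a medoid with a non-medoid whenever that lowers the cost.
    improved = True
    while improved:
        improved = False
        current = total_cost(points, medoids, dist)
        for pos, cand in product(range(k), range(n)):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            cost = total_cost(points, trial, dist)
            if cost < current:
                medoids, current, improved = trial, cost, True
    # Assign every observation to its closest medoid.
    labels = [
        min(range(k), key=lambda j: dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels


# Hypothetical (CCI, ST score) pairs; not study data.
toy = [(2, 9), (3, 9), (2, 10),   # moderate CCI, high ST
       (6, 4), (7, 4), (6, 5),    # high CCI, low ST
       (0, 4), (1, 4), (0, 5)]    # low CCI, low ST
medoids, labels = pam(toy, k=3)
print("medoid rows:", [toy[m] for m in medoids])
print("segment labels:", labels)
```

Because medoids are actual observations, each segment can be described by a real representative patient profile, which helps interpretability. Production implementations use faster update rules than this exhaustive swap search.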

Determining the optimal number of segments
We applied a post-hoc approach to define the optimal number of segments present within the data set, using the R package NbClust [23]. NbClust provides 30 indices which estimate the optimal number of segments in a data set and proposes the best clustering scheme from the different results obtained by varying all combinations of number of segments, distance measures, and clustering methods [23]. We selected the optimal number of segments using a majority rule, that is, the number of segments proposed by the largest number of indices. The final segmentation outcome was assessed for its clinical relevance and interpretability, to evaluate the goodness of the clustering algorithm results.
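The majority rule amounts to a vote count across indices. The sketch below shows the idea with invented votes; it does not reproduce NbClust output, the function name is hypothetical, and breaking ties in favour of the smaller number of segments is an assumption.

```python
from collections import Counter


def majority_rule(votes):
    """Return (k, count) for the number of segments proposed most often.

    Ties are broken in favour of the smaller k (an assumption made for
    this sketch).
    """
    counts = Counter(votes)
    best_k, n_votes = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))
    return best_k, n_votes


# Hypothetical proposals from eight validity indices; not study results.
proposed_k = [2, 3, 3, 4, 3, 2, 5, 3]
k, n = majority_rule(proposed_k)
print(f"k = {k}, proposed by {n} of {len(proposed_k)} indices")
```

Note that the winning value needs only the most votes, not more than half of them.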

Statistical analyses
Sociodemographic, clinical, functional, and psychosocial variables, and healthcare utilisation pattern were collected at baseline to describe the characteristics of each segment of patients. Chi-square/Fisher exact test and one-way ANOVA/Kruskal-Wallis H tests were used to determine whether there were statistically significant differences in the baseline characteristics across the patient segments for categorical and continuous variables (parametric and nonparametric), respectively.
Next, we examined the association of segment membership with prospective healthcare utilisation and mortality. We fitted multivariable negative binomial regression models using the patient segments as the predictor and the number of hospital admissions, LOS, ED visits, and programme utilisation over the 180-day follow-up period as outcomes. In addition, we applied multivariable Cox proportional hazard regression models to compare the time to the first hospital admission and mortality between segments within the 180-day follow-up period. All models were adjusted for age, gender, ethnicity, ward class, and baseline healthcare utilisation. Ward class refers to the highest tier of hospital ward in which the patient had stayed across all of his/her hospital admissions. Different ward classes have different facilities, charges and levels of subsidy for hospitalisation costs, but the quality of medical care remains the same in all wards. All analyses were performed in R version 3.6.1 (R Core Team, 2019).
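In generic notation, the two model families can be written as below. The symbols are ours rather than the authors': $S_{1i}$ and $S_{2i}$ are indicators for membership of Segments 1 and 2 (with Segment 3 as the reference group), and $x_i$ collects the adjustment covariates.

```latex
% Negative binomial model for a count outcome Y_i over follow-up
\log \mathbb{E}[Y_i] = \beta_0 + \beta_1 S_{1i} + \beta_2 S_{2i} + \gamma^{\top} x_i,
\qquad \mathrm{IRR}_j = e^{\beta_j}, \quad j = 1, 2

% Cox proportional hazards model for time to first admission or death
h_i(t) = h_0(t)\,\exp\!\left(\theta_1 S_{1i} + \theta_2 S_{2i} + \delta^{\top} x_i\right),
\qquad \mathrm{HR}_j = e^{\theta_j}, \quad j = 1, 2
```

An incidence rate ratio (IRR) or hazard ratio (HR) above 1 therefore indicates a higher event rate or hazard in that segment than in Segment 3, holding the covariates fixed.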

Baseline characteristics of the three patient segments
Among all indices within the NbClust package, the largest group (9 of 30 indices) proposed 3 as the best number of segments. Following the majority rule, we ran the PAM algorithm with k = 3 in the study population.
There were a total of 1,012 patients across all three segments. The mean age of patients was 75.8 years (Standard Deviation (SD): 12.5). Most patients were female (55.5%) and of Chinese ethnicity (61.7%); 48.2% were married, and 43% lived in 4-room public housing apartments. Table 2 presents the baseline characteristics of the three patient segments.
There were no significant differences in age among segments. The three segments had significantly different medical complexity and psychosocial needs, as reflected by the CCI (p-value < 0.001) and ST score (p-value < 0.001). This reflects the central aim of cluster analysis, which is to maximise the separation between segments on the clustering variables. Segment 2 had the highest medical complexity, with the highest mean CCI score of 4.5 (SD: 2.0), followed by Segment 1 (mean CCI score 2.3 (SD: 2.3)) and Segment 3 (mean CCI score 0.9 (SD: 0.8)). Segment 1 had the highest psychosocial needs, as reflected by the highest mean ST score of 7.4 (SD: 1.2), with poorer patient/family social support, mental health, treatment compliance and coping response, followed by Segment 2 (mean ST score 4.3 (SD: 0.6)) and Segment 3 (mean ST score 4.2 (SD: 0.4)).
In addition, the non-clustering variables, including functional status and baseline healthcare utilisation, differed significantly. This demonstrated that each segment was largely distinct. The Barthel Activities of Daily Living (ADL) (p-value < 0.001) and Instrumental Activities of Daily Living (IADL) (p-value < 0.001) scores were significantly different across segments. Segment 1 had the highest functional limitation, with the lowest mean ADL score of 49.1 (SD: 37.4) and lowest mean IADL score of 2.2 (SD: 2.8), followed by Segment 2 (mean ADL score 60.8

Table 2. Baseline characteristics of the patient segments.

Healthcare utilisation in the follow-up period
We examined the number of hospital admissions, LOS, ED visits, and SOC visits in the 180-day follow-up period using the multivariable negative binomial model, adjusting for age, gender, ethnicity, ward class, and all baseline healthcare utilisation (i.e., number of ED visits at baseline, number of hospital admissions at baseline, and number of SOC visits at baseline) (Table 3).

Table 2 notes (https://doi.org/10.1371/journal.pone.0288441.t002):
Higher score signifies higher psychosocial needs.
e Score-based categorisation: Low risk = 4-7, moderate risk = 8-9, and high risk = 10-12.
f Poor = patient without caregiver or with caregiver and unwilling/unable to provide support. Fair = patient with existing caregiver and expresses caregiver burden, but open to options. Strong = patient with existing caregiver/formal or informal social support who is willing to provide care.
g Good = family/patient accepts condition and is able to work on issues. Moderate = family/patient accepts condition, is still grieving/anxious but within control. Poor = family/patient is in shock, not accepting condition, grieving intensely/highly anxious.

Programme utilisation in the follow-up period
We examined programme utilisation in the 180-day follow-up period using the multivariable negative binomial model, adjusting for age, gender, ethnicity, ward class, and baseline healthcare utilisation (Table 3). Compared to Segment 3, the rate of home visits was 76% higher in Segment 1 (IRR = 1.76, 95% CI: 1.6-2.0) and 25% higher in Segment 2 (IRR = 1.25, 95% CI: 1.1-1.4).
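The percentages quoted above follow directly from the incidence rate ratios: the relative change in rate is (IRR - 1) x 100. A small sketch of that arithmetic, with a hypothetical helper name:

```python
def irr_to_percent_change(irr: float) -> float:
    """Convert an incidence rate ratio to a percent change versus the reference."""
    return round((irr - 1.0) * 100.0, 1)


# IRRs for programme utilisation reported in the text (reference: Segment 3).
for segment, irr in [("Segment 1", 1.76), ("Segment 2", 1.25)]:
    print(f"{segment}: {irr_to_percent_change(irr):+.0f}% vs Segment 3")
```

The same conversion applies to the confidence limits, so an interval that excludes 1 corresponds to a percent change that excludes 0.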

Time to first hospital admission and mortality
We examined time to first hospital admission and time to mortality during the 180-day follow-up, using Cox proportional hazard regression models adjusting for age, gender, ethnicity,

Discussion
The cluster analysis in this study identified three patient segments based on medical complexity and psychosocial needs variables. The three segments had significantly different medical complexity and psychosocial needs, as the cluster analysis maximised the distance between segments on the segmentation variables [24]. In addition, other non-clustering variables, i.e., ADL, IADL, and baseline healthcare utilisation, were also found to differ significantly, demonstrating that each segment was largely distinct. Fig 1 illustrates the differences among segments.

Population characteristics
Of the three segments, Segment 1 had the highest functional dependency and psychosocial needs, with 35% of the segment being totally dependent based on ADL score category, the lowest IADL score, and the highest risk of having psychosocial needs based on the ST score. Segment 2 had the highest medical complexity, reflected by the highest CCI score. Segment 3 had the lowest medical complexity, functional dependency and psychosocial needs; hence this segment was used as the reference group for the multivariable analysis.

Population utilisation and mortality
Our study examined the association between segment membership and healthcare utilisation and mortality at 180-day follow-up. We found that the rates of hospital admission, LOS (in days), ED visits, SOC visits, and programme utilisation, and the risks of hospital admission and mortality in the follow-up period, differed across the three patient segments, suggesting that our segmentation approach could arrive at population segments with varying risk of healthcare utilisation and mortality. Patients in Segment 1, with the highest functional dependency and psychosocial needs, had the highest programme utilisation, moderate rates of hospitalisation and ED visits, and the highest risk of dying in the follow-up period. It is noteworthy that this segment also had moderate medical complexity, with an average CCI score of 2.3 (SD: 2.3). Another study reported that individuals who were physically dependent had higher odds of mortality in the follow-up period than those who were largely independent, even when both groups had a similar level of comorbidity [25]. This suggests that physical dependency might serve as a differentiating factor that contributes to higher odds of mortality in the former group. A study amongst older adults with multimorbidity reported that physical functioning acted as a mediator of the association between multimorbidity and mortality [26]. The literature also reports that psychosocial factors, e.g., social support, mental health, and coping ability, are important predictors of mortality beyond severity of comorbidity. This might relate to the negative influence of these factors on physical health. It is plausible that this involves mechanisms that induce stress, which may affect health in various ways, such as via the endocrine and/or immune systems [27].
Patients in Segment 2, with the highest medical complexity reflecting more medical needs, had the highest risk of healthcare utilisation in the follow-up period. This finding reinforced previous studies on the impact of high medical needs on hospitalisations and ED visits [4]. It is noteworthy that Segment 2 had moderate programme utilisation during the follow-up period. Although the study did not look into the cause, it could be postulated from the high SOC visits that the segment had more unstable medical conditions. Segment 3, as the reference group, had the lowest programme utilisation, healthcare utilisation and mortality in the follow-up period.

Policy and research implications
A previous systematic review found that the predisposing, enabling and need characteristics of high-cost patients included age, ethnicity, socioeconomic conditions, organisational factors (e.g., supply of health services and providers), medical conditions, health, and functional status. In these high-cost patients, inpatient services were most often reported as the main cost driver [4]. Furthermore, the review reported that high-cost patients were more likely to die [4]. Our study reinforced the association between the characteristics of high users and healthcare utilisation and mortality.
Previous studies were mostly conducted in the United States and Canada, with limited studies done in Asia [4][5][6]. It is plausible that patients' characteristics and utilisation vary across countries due to differences in epidemiological and health system factors. In addition, different countries might have different healthcare financing systems, which could involve a single-payer or multi-payer framework.
Our study took a more in-depth look at profiling the high users of inpatient services in Singapore. Moreover, it used a segmentation analysis approach to identify homogeneous population segments with similar characteristics and needs, to inform population health management and to allow for policy tailored to the local context.
This study reinforced the need to look at medical complexity, functional, and psychosocial needs as the distinguishing characteristics in segmenting high users of inpatient services into different population segments. In this study, Segment 1, with more functional dependency and psychosocial needs, was found to have a higher rate of programme utilisation, which included home visits by a doctor, nurse, physiotherapist, speech therapist, or occupational therapist. This suggests a greater need for comprehensive assessment and care from the multi-disciplinary team. Segment 2, with more medical needs, had the most need for hospital services. Studies reported that medically complex patients might benefit from more intensive medical models, including ambulatory intensive care units [4,28].
In this study of high users, those with the lowest medical, functional and psychosocial needs (Segment 3) comprised almost half of the total population (44%), despite their relatively lower demand on both programme and hospital services compared to Segments 1 and 2. Further profiling and differentiation of this group might aid in tailoring appropriate care to support them.
This study demonstrated that cluster analysis using administrative data could help identify patient segments with distinct healthcare characteristics and utilisation among patients with complex needs who are high users of inpatient services. The majority of studies on population segmentation used the general population as the target population for segmentation, while others were restricted to those with specific diseases or conditions [29]. This study further defined the target population at risk of adverse events, i.e., complex high users of inpatient services at risk of frequent hospital admission and mortality.
Most population segmentation systems adopt a healthcare utilisation risk-based rather than a healthcare needs-based segmentation approach [2,30,31]. There is a growing view that segmentation based on healthcare utilisation risk is inadequate to inform the development and allocation of healthcare services, as needs for specific healthcare services can arise from several underlying phenomena, including the presence of morbidity risk, pain/discomfort, dysfunction, risk of mortality, the patient's subjective impression, or providers' normative assessment [2]. A healthcare needs-based segmentation approach offers a more comprehensive view of individuals, as it could provide indications for healthcare interventions that may lessen morbidity risk and address their needs [2]. Policy makers can optimise population health planning by applying a healthcare needs-based population segmentation approach as an integral component in assessing population healthcare needs in relation to available services and resource allocation.
This study used CCI and ST as the segmentation variables, which are able to capture a comprehensive view of the individuals' medical and psychosocial conditions, as a proxy of their needs. These variables not only covered the individuals' risk of morbidity, but also their health-related behavior such as treatment compliance, and the wider determinants of health including their support system and coping response.
Most clustering algorithms depend on some assumptions in order to define the number of subgroups present in a data set. Similarly, the PAM approach required the number of clusters (k) to be pre-determined as input. As a consequence, the resulting clustering scheme requires evaluation of its validity and of the goodness of the clustering algorithm results [23]. Criteria to assess market segmentation have been widely adopted to evaluate the quality of population segmentation in the healthcare context [13,32]. These include identifiability/interpretability (segments should be recognised and interpreted easily), substantiality (each segment should have sufficient size), stability (each segment should be relatively stable over time), and actionability/accessibility (each segment should be easily addressed and targeted with distinctive health intervention strategies) [13,32]. In this study, the segmentation outcomes demonstrated meaningful interpretation and clinical relevance, as each segment was distinct in its pattern of medical, functional and psychosocial needs. In addition, the segments also differed in their rate and risk of future healthcare utilisation and mortality. These findings indicate a potential application for health service planning at the population level.
This study has several limitations. First, our study included older patients with high hospital service utilisation and complex chronic conditions, which may limit the generalisability of our findings to a more diverse and general population. Second, as the ST was not widely used and involved subjectivity from provider-rated assessment, its applicability to other populations requires further testing. Further studies might consider assessing the validity and reliability of the tool in different contexts and populations. Third, we did not assess the segments' performance against a different data set. Hence, external validation and generalisability of the segmentation outcome require further testing. Fourth, we did not include healthcare utilisation from other RHSs. However, the majority of patients in Singapore utilise services within one regional health system [33]. Fifth, we did not conduct any analysis to compare healthcare utilisation and mortality in the excluded high users group, i.e., those discharged to institutional care or other community support services. Future studies might look into this population segment to understand its characteristics, healthcare utilisation pattern, and risk of mortality. Lastly, this study is limited by the relatively short follow-up time. Future studies might consider a longer follow-up period to assess the stability of the segmentation outcome over time and to track the progression of each segment and transitions between segments, thus allowing for the evaluation of targeted healthcare interventions and the development of preventive healthcare services for each segment.
It is important to align the segmentation approach with the population segmentation objectives, as segmentation outcomes may differ with varied input variables and analytical approaches. Selection of variables for segmentation purposes requires iterative processes with contextual knowledge and expertise in data mining and clinical applicability.

Conclusion
A cluster analysis approach in segmenting the population of complex high users of inpatient services using administrative data could identify segments with distinct characteristics and care needs. This study demonstrated the value in capturing a more comprehensive view on factors that influence the healthcare utilisation among complex high users of inpatient services. It provides evidence on healthcare needs-based segments that might potentially help policy makers in allocating resources efficiently by tailoring intervention to meet these needs. It also informs future research on the potential benefit of including data on psychosocial and nursing assessment to characterise complex high users of inpatient services.
Supporting information S1